Phoneme-to-grapheme Conversion for Out-of-vocabulary Words in Large Vocabulary Speech Recognition
نویسندگان
چکیده
In this paper, we describe a method to enhance the readability of the textual output in a large vocabulary continuous speech recognition system when out-of-vocabulary words occur. The basic idea is to replace uncertain words in the transcriptions with a phoneme recognition result that is postprocessed using a phoneme-to-grapheme converter. This converter turns phoneme strings into grapheme strings and is trained using machine learning techniques. Experiments show that, even when the grapheme strings are not fully correct, the resulting transcriptions are more easily readable than the original ones.
منابع مشابه
Optimizing phoneme-to-grapheme conversion for out-of-vocabulary words in speech recognition
In this report, we present the results of further research on phoneme-to-grapheme (P2G) conversion for Out-Of-Vocabulary items (OOVs), recognized using phoneme recognition, in large vocabulary speech recognition. First, we summarize the results of previous research, and then we start with reporting on several optimization strategies for the Machine Learning technique we used to carry out P2G co...
متن کاملPhoneme-to-grapheme conversion for out-of-vocabulary words in speech recognition
In this report, we show that Out-Of-Vocabulary items (OOVs), recognized using phoneme recognition, can be reasonably reliably transcribed orthographically using Machine Learning techniques. More specifically, (i) we show baseline performance of a machine learning approach to phoneme-to-grapheme conversion when different levels of artificial noise are added (simulating phoneme recognizer errors)...
متن کاملMemory-Based Phoneme-to-Grapheme Conversion A Method for Dealing with Out-of-Vocabulary Items in Speech Recognition
In this paper, we describe a method to enhance the readability of out-of-vocabulary items (OOVs) in the textual output in a large vocabulary continuous speech recognition system. The basic idea is to indicate uncertain words in the transcriptions and replace them with phoneme recognition results that are post-processed using a phoneme-to-grapheme (P2G) converter. We concentrate on the final ste...
متن کاملA Method for Dealing with Out-of-Vocabulary Items in Speech Recognition
In this paper, we describe a method to enhance the readability of out-of-vocabulary items (OOVs) in the textual output in a large vocabulary continuous speech recognition system. The basic idea is to indicate uncertain words in the transcriptions and replace them with phoneme recognition results that are post-processed using a phoneme-to-grapheme (P2G) converter. We concentrate on the final ste...
متن کاملMultigram-based grapheme-to-phoneme conversion for LVCSR
Many important speech recognition tasks feature an open, constantly changing vocabulary. (E.g. broadcast news transcription, spoken document retrieval, . . . ) Recognition of (new) words requires acoustic baseforms for them to be known. Commonly words are transcribed manually, which poses a major burden on vocabulary adaptation and interdomain portability. In this work we investigate the possib...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
دوره شماره
صفحات -
تاریخ انتشار 2001